ON NORMAL DOMINATION OF (SUPER)MARTINGALES

IOSIF PINELIS 


Abstract. Let $(S_0, S_1, \dots)$ be a supermartingale relative to a nondecreasing sequence of $\sigma$-algebras $(H_{\le 0}, H_{\le 1}, \dots)$, with $S_0 \le 0$ almost surely (a.s.) and differences $X_i := S_i - S_{i-1}$. Suppose that for every $i = 1, 2, \dots$ there exist $H_{\le(i-1)}$-measurable r.v.'s $C_{i-1}$ and $D_{i-1}$ and a positive real number $s_i$ such that $C_{i-1} \le X_i \le D_{i-1}$ and $D_{i-1} - C_{i-1} \le 2 s_i$ a.s. Then for all real $t$ and natural $n$

$\mathsf{E} f_t(S_n) \le \mathsf{E} f_t(sZ),$

where $f_t(x) := \max(0, x - t)^5$, $s := \sqrt{s_1^2 + \dots + s_n^2}$, and $Z \sim N(0,1)$. In particular, this implies

$\mathsf{P}(S_n \ge x) \le c_{5,0}\, \mathsf{P}(sZ \ge x) \quad \forall x \in \mathbb{R},$

where $c_{5,0} = 5!(e/5)^5 = 5.699\ldots$. Results for $\max_{0 \le k \le n} S_k$ in place of $S_n$ and for concentration of measure also follow.

1. Introduction


The sharp form,

(1.1)    $\mathsf{E} f(\varepsilon_1 a_1 + \dots + \varepsilon_n a_n) \le \mathsf{E} f(Z),$

of Khinchin's inequality for $f(x) = |x|^p$ for the normalized Rademacher sum $\varepsilon_1 a_1 + \dots + \varepsilon_n a_n$, with

$a_1^2 + \dots + a_n^2 = 1,$

was proved by Whittle (1960) for $p \ge 3$ and Haagerup (1982) for $p \ge 2$; here and elsewhere, the $\varepsilon_i$'s are independent Rademacher random variables (r.v.'s), so that $\mathsf{P}(\varepsilon_i = 1) = \mathsf{P}(\varepsilon_i = -1) = 1/2$ for all $i$, and $Z \sim N(0,1)$.

For $f(x) = e^{\lambda x}$ ($\lambda \ge 0$), this inequality follows from Hoeffding (1963), whence

$\mathsf{P}(\varepsilon_1 a_1 + \dots + \varepsilon_n a_n \ge x) \le \inf_{\lambda \ge 0} \frac{\mathsf{E} e^{\lambda Z}}{e^{\lambda x}} = e^{-x^2/2}, \quad x \ge 0.$

Since $\mathsf{P}(Z \ge x) \sim \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2}$ ($x \to \infty$), a factor $\frac{1}{x}$ is “missing” here. The apparent cause of this deficiency is that the class of the exponential moment functions $f(x) = e^{\lambda x}$ ($\lambda \ge 0$) is too small (and so is the class of the power functions $f(x) = |x|^p$).
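The size of the “missing” factor can be seen numerically. The following Python sketch is only an illustration under our own naming (`normal_tail`, `exponential_bound`, and `ratio` are hypothetical helpers, not notation from the paper); it compares the exponential bound $e^{-x^2/2}$ with the exact normal tail and with the predicted growth $x\sqrt{2\pi}$.

```python
import math

def normal_tail(x: float) -> float:
    """P(Z >= x) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def exponential_bound(x: float) -> float:
    """The exponential bound exp(-x^2/2), meaningful for x >= 0."""
    return math.exp(-x * x / 2.0)

def ratio(x: float) -> float:
    """Exponential bound divided by the exact normal tail."""
    return exponential_bound(x) / normal_tail(x)

# The ratio grows roughly like x * sqrt(2*pi), i.e. the bound loses a factor of order x.
for x in (1.0, 2.0, 4.0, 8.0):
    print(x, ratio(x), x * math.sqrt(2.0 * math.pi))
```

The closer `ratio(x)` is to `x * sqrt(2*pi)`, the more accurately the lost factor is described by $1/x$.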

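Consider the much richer classes of functions $\mathcal{F}_+^{(\alpha)}$ ($\alpha \ge 0$), consisting of all the functions $f\colon \mathbb{R} \to \mathbb{R}$ given by the formula

$f(x) = \int_{-\infty}^{\infty} (x - t)_+^{\alpha}\, \mu(dt), \quad x \in \mathbb{R},$

where $\mu \ge 0$ is a Borel measure, $x_+ := \max(0, x)$, $x_+^{\alpha} := (x_+)^{\alpha}$, $0^0 := 0$.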
It is easy to see [?, Proposition 1(ii)] that

(1.2)    $0 \le \beta < \alpha$ implies $\mathcal{F}_+^{(\alpha)} \subseteq \mathcal{F}_+^{(\beta)}$.


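Proposition 1.1. [?] For natural $\alpha$, one has $f \in \mathcal{F}_+^{(\alpha)}$ if and only if $f$ has finite derivatives $f^{(0)} := f, f^{(1)}, \dots, f^{(\alpha-1)}$ on $\mathbb{R}$ such that $f^{(j)}(-\infty) = 0$ for $j = 0, 1, \dots, \alpha - 1$ and $f^{(\alpha-1)}$ is convex on $\mathbb{R}$.

It follows from Proposition 1.1 that, for every $t \in \mathbb{R}$, every $\beta \ge \alpha$, and every $\lambda > 0$, the functions $u \mapsto (u - t)_+^{\beta}$ and $u \mapsto e^{\lambda(u - t)}$ belong to $\mathcal{F}_+^{(\alpha)}$, while the functions $u \mapsto |u - t|^{\beta}$ and $u \mapsto \cosh \lambda(u - t)$ belong to $\mathcal{F}^{(\alpha)}$.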

Eaton (1970) proved the Khinchin-Whittle-Haagerup inequality (1.1) for a class of moment functions, which essentially coincides with the class $\mathcal{F}^{(3)}$. Based on asymptotics, numerics, and a certain related inequality, Eaton (1974) conjectured that the mentioned moment comparison inequality of his implies that

$\mathsf{P}(\varepsilon_1 a_1 + \dots + \varepsilon_n a_n \ge x) \le \frac{2e^3}{9}\, \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} \quad \forall x > \sqrt{2}.$

Pinelis (1994) proved the following improvement of this conjecture:

(1.3)    $\mathsf{P}(\varepsilon_1 a_1 + \dots + \varepsilon_n a_n \ge x) \le \frac{2e^3}{9}\, \mathsf{P}(Z \ge x) \quad \forall x \in \mathbb{R},$

as well as certain multidimensional extensions of these results.

Later it was realized in Pinelis (1998) that the reason why it is possible to extract the tail comparison inequality (1.3) from the Khinchin-Eaton moment comparison inequality (1.1) for $f \in \mathcal{F}^{(3)}$ is that the tail function $u \mapsto \mathsf{P}(Z \ge u)$ is log-concave. This realization resulted in a general device, which allows one to extract the optimal tail comparison inequality from an appropriate moment comparison inequality. The following is a special case of Theorem 4 of Pinelis (1999); see also Theorem 3.11 of Pinelis (1998).


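Theorem 1.2. Suppose that $0 \le \beta \le \alpha$, $\xi$ and $\eta$ are real-valued r.v.'s, and the tail function $u \mapsto \mathsf{P}(\eta \ge u)$ is log-concave on $\mathbb{R}$. Then the comparison inequality

(1.4)    $\mathsf{E} f(\xi) \le \mathsf{E} f(\eta)$ for all $f \in \mathcal{F}_+^{(\alpha)}$

implies

(1.5)    $\mathsf{E} f(\xi) \le c_{\alpha,\beta}\, \mathsf{E} f(\eta)$ for all $f \in \mathcal{F}_+^{(\beta)}$

and, in particular, for all real $x$,

(1.6)    $\mathsf{P}(\xi \ge x) \le \inf_{f \in \mathcal{F}_+^{(\alpha)}} \frac{\mathsf{E} f(\eta)}{f(x)}$

(1.7)    $= B_{\mathrm{opt}}(x) := \inf_{t \in (-\infty, x)} \frac{\mathsf{E}(\eta - t)_+^{\alpha}}{(x - t)^{\alpha}}$

(1.8)    $\le \min\Bigl( c_{\alpha,0}\, \mathsf{P}(\eta \ge x),\ \inf_{h > 0} e^{-hx}\, \mathsf{E} e^{h\eta} \Bigr),$

where

(1.9)    $c_{\alpha,\beta} := \frac{\Gamma(\alpha + 1)(e/\alpha)^{\alpha}}{\Gamma(\beta + 1)(e/\beta)^{\beta}}.$

Moreover, the constant $c_{\alpha,\beta}$ is the best possible in (1.5) and (1.8).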


A similar result for the case when $\alpha = 1$ and $\beta = 0$ is contained in the book by Shorack and Wellner (1986), pages 797-799.

Remark 1.3. As follows from [?, Remark 3.13], a useful point is that the requirement of the log-concavity of the tail function $q(u) := \mathsf{P}(\eta \ge u)$ in Theorem 1.2 can be relaxed by replacing $q(x) = \mathsf{P}(\eta \ge x)$ by any (e.g., the least) log-concave majorant of $q$. However, the optimality of $c_{\alpha,\beta}$ is then not guaranteed.


Note that $c_{3,0} = 2e^3/9$, which is the constant factor in (1.3). Bobkov, Götze, and Houdré (2001) obtained a simpler proof of inequality (1.3), but with a constant factor $12.0099\ldots$ in place of $2e^3/9 = 4.4634\ldots$.
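The constants quoted above can be reproduced directly from formula (1.9). The sketch below is a minimal illustration; the function name `c` and the convention that $(e/\beta)^{\beta}$ is read as $1$ at $\beta = 0$ (its limit as $\beta \downarrow 0$) are our assumptions for the purpose of the example.

```python
import math

def c(alpha: float, beta: float) -> float:
    """c_{alpha,beta} as in (1.9); (e/a)^a is taken to be 1 at a = 0 (assumed limit convention)."""
    def g(a: float) -> float:
        if a == 0:
            return 1.0
        return math.gamma(a + 1.0) * (math.e / a) ** a
    return g(alpha) / g(beta)

print(c(3, 0), 2 * math.e ** 3 / 9)   # both are approximately 4.4634
print(c(5, 0))                         # approximately 5.699
```

In particular, `c(alpha, alpha)` equals 1, in agreement with the fact that (1.5) reduces to (1.4) when $\beta = \alpha$.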

Pinelis (1999) obtained the “discrete” improvement of (1.3):

(1.10)    $\mathsf{P}(\varepsilon_1 a_1 + \dots + \varepsilon_n a_n \ge x) \le \frac{2e^3}{9}\, \mathsf{P}\Bigl( \frac{1}{\sqrt{n}} (\varepsilon_1 + \dots + \varepsilon_n) \ge x \Bigr)$

for all values $x$ of the r.v. $\frac{1}{\sqrt{n}} (\varepsilon_1 + \dots + \varepsilon_n)$.

2. Domination by normal moments and tails

Theorem 2.1. Let $(S_0 \le 0, S_1, \dots)$ be a supermartingale, with increments $X_i := S_i - S_{i-1}$, $i = 1, 2, \dots$. Suppose that for every $i = 1, 2, \dots$ there exist $H_{\le(i-1)}$-measurable r.v.'s $C_{i-1}$ and $D_{i-1}$ and a positive real number $s_i$ such that

(2.1)    $C_{i-1} \le X_i \le D_{i-1}$ and

(2.2)    $D_{i-1} - C_{i-1} \le 2 s_i$

with probability 1. Then for all $f \in \mathcal{F}_+^{(5)}$ and all $n = 1, 2, \dots$

(2.3)    $\mathsf{E} f(S_n) \le \mathsf{E} f(sZ),$

where

$s := \sqrt{s_1^2 + \dots + s_n^2}$

and $Z \sim N(0,1)$.


The proofs of this and other statements (whenever necessary) are deferred to Section 5.

By virtue of Theorem 1.2, one has the following corollary under the conditions of Theorem 2.1.

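Corollary 2.2. For all $\beta \in [0, 5]$, all $f \in \mathcal{F}_+^{(\beta)}$, and all $n = 0, 1, \dots$

(2.4)    $\mathsf{E} f(S_n) \le c_{5,\beta}\, \mathsf{E} f(sZ).$

In particular, for all real $x$,

(2.5)    $\mathsf{P}(S_n \ge x) \le \inf_{f \in \mathcal{F}_+^{(5)}} \frac{\mathsf{E} f(sZ)}{f(x)}$

(2.6)    $= \inf_{t \in (-\infty, x)} \frac{\mathsf{E}(sZ - t)_+^5}{(x - t)^5}$

(2.7)    $\le \min\Bigl( c_{5,0}\, \mathsf{P}(sZ \ge x),\ \inf_{h > 0} e^{-hx}\, \mathsf{E} e^{hsZ} \Bigr)$

(2.8)    $= \min\Bigl( c_{5,0}\, \overline{\Phi}\Bigl(\frac{x}{s}\Bigr),\ \exp\Bigl(-\frac{x^2}{2s^2}\Bigr) \Bigr),$

where $\overline{\Phi}(u) := \mathsf{P}(Z \ge u)$, the equality in (2.8) is understood for $x \ge 0$ (for $x < 0$ the infimum over $h > 0$ in (2.7) equals 1), and

$c_{5,0} = 5!(e/5)^5 = 5.699\ldots.$

The upper bound $\exp\bigl(-\frac{x^2}{2s^2}\bigr)$ was obtained by Hoeffding (1963) for the case when the $C_{i-1}$'s and $D_{i-1}$'s are non-random.

The upper bound $c_{5,0}\, \mathsf{P}(sZ \ge x)$, but with constant factor 435 in place of $c_{5,0} = 5.699\ldots$, was obtained in [?] for the case when $(S_i)$ is a martingale.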
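As a sanity check of (2.8) in the simplest setting, consider the normalized Rademacher sum $(\varepsilon_1 + \dots + \varepsilon_n)/\sqrt{n}$, for which one may take $C_{i-1} = -1/\sqrt{n}$, $D_{i-1} = 1/\sqrt{n}$, $s_i = 1/\sqrt{n}$, and hence $s = 1$. The sketch below computes the exact tail from the binomial distribution and compares it with the bound at the levels $x = m/\sqrt{n} \ge 0$. The helper names are ours and purely illustrative.

```python
import math

C50 = math.factorial(5) * (math.e / 5.0) ** 5  # c_{5,0} = 5!(e/5)^5

def normal_tail(z: float) -> float:
    """P(Z >= z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def bound_2_8(x: float, s: float = 1.0) -> float:
    """min(c_{5,0} P(sZ >= x), exp(-x^2/(2 s^2))); used here only for x >= 0."""
    return min(C50 * normal_tail(x / s), math.exp(-x * x / (2.0 * s * s)))

def rademacher_tail(n: int, m: int) -> float:
    """Exact P(eps_1 + ... + eps_n >= m) for independent Rademacher eps_i."""
    # eps_1 + ... + eps_n = 2K - n with K ~ Binomial(n, 1/2).
    return sum(math.comb(n, k) for k in range(n + 1) if 2 * k - n >= m) / 2 ** n

def worst_ratio(n: int) -> float:
    """Largest value of (exact tail)/(bound) over the levels x = m/sqrt(n), m = 0..n."""
    return max(
        rademacher_tail(n, m) / bound_2_8(m / math.sqrt(n))
        for m in range(0, n + 1)
    )

for n in (1, 5, 20, 50):
    print(n, worst_ratio(n))
```

Since the exact tail is a step function and the bound is decreasing in $x$, checking the integer levels $m$ covers all $x \ge 0$.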


Theorem 2.3. Let $(S_0 \le 0, S_1, \dots)$ be a supermartingale, with increments $X_i := S_i - S_{i-1}$, $i = 1, 2, \dots$. Suppose that for every $i = 1, 2, \dots$ there exist a positive $H_{\le(i-1)}$-measurable r.v. $D_{i-1}$ and a positive real number $\hat s_i$ such that

(2.9)    $X_i \le D_{i-1}$ and

(2.10)    $\frac12 \Bigl( D_{i-1} + \frac{\mathrm{Var}_{i-1} X_i}{D_{i-1}} \Bigr) \le \hat s_i$

with probability 1. Let

(2.11)    $\hat s := \sqrt{\hat s_1^2 + \dots + \hat s_n^2}.$

Then one has all the inequalities (2.3)-(2.8), only with $s$ replaced by $\hat s$.

Remark 2.4. Theorem 2.1 may be considered as a special case of Theorem 2.3. Indeed, it can be seen from the proofs of these two theorems (see Lemma 5.1.1 and Lemma 3.1 in [25]) that one may assume without loss of generality that the supermartingales $(S_i)$ in Theorems 2.1 and 2.3 are actually martingales with $S_0 = 0$. Therefore, to deduce Theorem 2.1 from Theorem 2.3, it is enough to observe that for any r.v. $X$ and constants $c < 0$ and $d > 0$, one has the following implication:

(2.12)    $\mathsf{E} X = 0$ & $\mathsf{P}(c \le X \le d) = 1 \implies \mathrm{Var}\, X \le |c|\, d.$

In turn, implication (2.12) follows from [?], which reduces the situation to that of a r.v. $X$ taking on only two values. Alternatively, in light of the duality result [?, (4)], it is easy to give a direct proof of (2.12). Indeed, $\mathsf{E} X = 0$ and $\mathsf{P}(c \le X \le d) = 1$ imply

$0 \ge \mathsf{E}(X - c)(X - d) = \mathsf{E} X^2 + cd = \mathrm{Var}\, X - |c|\, d.$

However, rather than deducing Theorem 2.1 from Theorem 2.3, we shall go in the opposite direction, proving Theorem 2.3 based on Theorem 2.1. Thus, Theorem 2.1 is seen as the main result of this paper.


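Remark 2.5. The set of conditions (2.9)-(2.10) is equivalent to

$X_i \le D_{i-1}$ and $\sigma^*\bigl( D_{i-1}, \sqrt{\mathrm{Var}_{i-1} X_i} \bigr) \le \hat s_i$

with probability 1, where

$\sigma^*(d_0, \sigma) := \frac12 \inf_{d \ge d_0} \Bigl( d + \frac{\sigma^2}{d} \Bigr) = \begin{cases} \sigma & \text{if } \sigma \ge d_0, \\ \frac12 \Bigl( d_0 + \frac{\sigma^2}{d_0} \Bigr) & \text{if } \sigma \le d_0, \end{cases}$

for positive $\sigma$ and $d_0$. This follows simply because the inequalities $X_i \le D_{i-1}$ and $d \ge D_{i-1}$ imply $X_i \le d$.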


From the “right-tail” bounds stated above, “two-tail” ones immediately follow:

Corollary 2.6. Let $(S_0 = 0, S_1, \dots)$ be a martingale, with increments $X_i := S_i - S_{i-1}$, $i = 1, 2, \dots$. Suppose that conditions (2.1) and (2.2) hold. Then inequalities (2.3) and (2.4) hold for all $f \in \mathcal{F}^{(5)}$ and $f \in \mathcal{F}^{(\beta)}$ ($\beta \in [0, 5]$), rather than only for all $f \in \mathcal{F}_+^{(5)}$ and $f \in \mathcal{F}_+^{(\beta)}$, respectively.

Corollary 2.7. Let $(S_0 = 0, S_1, \dots)$ be a martingale, with increments $X_i := S_i - S_{i-1}$, $i = 1, 2, \dots$. Suppose that condition (2.10) holds, and condition (2.9) holds for $|X_i|$ in place of $X_i$. Then inequalities (2.3) and (2.4) with $s$ replaced by $\hat s$ hold for all $f \in \mathcal{F}^{(5)}$ and $f \in \mathcal{F}^{(\beta)}$ ($\beta \in [0, 5]$), rather than only for all $f \in \mathcal{F}_+^{(5)}$ and $f \in \mathcal{F}_+^{(\beta)}$, respectively.

That $(S_0, S_1, \dots)$ in Theorems 2.1 and 2.3 is allowed to be a supermartingale (rather than only a martingale) makes it convenient to use the simple but powerful truncation tool. (Such a tool was used, for example, in [22] to prove limit theorems for large deviation probabilities based only on precise enough probability inequalities and without using Cramér's transform, the standard device in the theory of large deviations.) Thus, for instance, one has the following corollary from Theorem 2.3.


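Corollary 2.8. Let $(S_0 \le 0, S_1, \dots)$ be a supermartingale, with increments $X_i := S_i - S_{i-1}$, $i = 1, 2, \dots$. For every $i = 1, 2, \dots$, let $D_{i-1}$ be a positive $H_{\le(i-1)}$-measurable r.v. and let $\hat s_i$ be a positive real number such that (2.10) holds (while (2.9) does not have to). Let $\hat s$ be still defined by (2.11).

Then for all real $x$

(2.13)    $\mathsf{P}(S_n \ge x) \le \mathsf{P}\Bigl( \max_{1 \le i \le n} \frac{X_i}{D_{i-1}} > 1 \Bigr) + \min\Bigl( c_{5,0}\, \overline{\Phi}\Bigl(\frac{x}{\hat s}\Bigr),\ \exp\Bigl(-\frac{x^2}{2\hat s^2}\Bigr) \Bigr)$

(2.14)    $\le \sum_{1 \le i \le n} \mathsf{P}(X_i > D_{i-1}) + \min\Bigl( c_{5,0}\, \overline{\Phi}\Bigl(\frac{x}{\hat s}\Bigr),\ \exp\Bigl(-\frac{x^2}{2\hat s^2}\Bigr) \Bigr).$

These bounds are much more precise than the exponential bounds in [?].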


3. Maximal inequalities

Introduce

$M_n := \max_{0 \le k \le n} S_k.$

Theorem 3.1. Let $(S_0 = 0, S_1, \dots)$ be a martingale. Then the upper bounds on $\mathsf{P}(S_n \ge x)$ given in Corollary 2.2 and Theorem 2.3 are also upper bounds on $\mathsf{P}(M_n \ge x)$, under the same conditions: (2.1)-(2.2) and (2.9)-(2.10), respectively.


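Theorem 3.2. Let $0 \le \beta \le \alpha$ and $x > t$, and let $(S_n)$ be a martingale or, more generally, a submartingale. Assume, moreover, that $\alpha > 1$. Then, for any natural $n$,

(3.1)    $\mathsf{E}(M_n - x)_+^{\beta} \le k_{1;\alpha,\beta}\, \frac{\mathsf{E}(S_n - t)_+^{\alpha}}{(x - t)^{\alpha - \beta}},$

where

(3.2)    $k_{1;\alpha,\beta} := \sup_{\sigma > 0} \sigma^{-\beta(\alpha - 1)} \Bigl( \int_0^{\sigma} \frac{\beta s^{\beta - 1}\, ds}{1 + s} \Bigr)^{\alpha}$

if $\beta > 0$, and $k_1(\alpha, 0) := 1$. The particular cases of (3.1) corresponding to $\beta = 0$ and $\beta = \alpha$, respectively, are Doob's inequalities

(3.3)    $\mathsf{P}(M_n \ge x) \le \frac{\mathsf{E}(S_n - t)_+^{\alpha}}{(x - t)^{\alpha}}$

and

(3.4)    $\mathsf{E}(M_n)_+^{\alpha} \le \Bigl( \frac{\alpha}{\alpha - 1} \Bigr)^{\alpha}\, \mathsf{E}(S_n)_+^{\alpha}.$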

Theorem 3.3. Let $(S_0 = 0, S_1, \dots)$ be a martingale. Then inequality (2.4) and its counterpart with $s$ replaced by $\hat s$ hold if $S_n$ is replaced there by $M_n$ and $c_{5,\beta}$ by $\frac{k_{1;5,\beta}}{k_{5,\beta}}\, c_{5,\beta}$, under the same conditions: (2.1)-(2.2) and (2.9)-(2.10), respectively.

Similarly, results of [25] can be extended.

Remark 3.4. Note that

$\int_0^{\sigma} \frac{\beta s^{\beta - 1}\, ds}{1 + s} = \sigma^{\beta}\, {}_2F_1(\beta, 1; 1 + \beta; -\sigma) = \beta \sigma^{\beta} \int_0^1 \frac{(1 - u)^{\beta - 1}}{(1 + \sigma u)^{\beta}}\, du,$

where ${}_2F_1$ is a hypergeometric function. Note also that there is some $\sigma_{\alpha,\beta} \in (0, \infty)$ such that the expression under the sup sign in (3.2) is increasing in $\sigma \in (0, \sigma_{\alpha,\beta})$ and decreasing in $\sigma \in (\sigma_{\alpha,\beta}, \infty)$; this can be seen from the proof of Proposition 3.9. Thus, the sup is attained at the unique point $\sigma_{\alpha,\beta}$.

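Proposition 3.5. Let $\alpha$ and $\beta$ be as in Theorem 3.2. Then

(3.5)    $k_{1;\alpha,\beta} \le k_{2;\alpha,\beta} := \frac{\Gamma(1 + \beta)\, \Gamma(\alpha - \beta)}{\Gamma(\alpha)}.$

Remark 3.6. $k_2(\alpha, 0) = k(\alpha, 0) = 1 = k_1(\alpha, 0)$.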


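Proposition 3.7. Let $0 \le \beta < \alpha$, $x > t$, and

(3.6)    $k_{\alpha,\beta} := \frac{\beta^{\beta} (\alpha - \beta)^{\alpha - \beta}}{\alpha^{\alpha}}.$

Then

(3.7)    $\forall u \in \mathbb{R} \quad (u - x)_+^{\beta} \le k_{\alpha,\beta}\, \frac{(u - t)_+^{\alpha}}{(x - t)^{\alpha - \beta}},$

and $k_{\alpha,\beta}$ is the best constant here. (The values at $\beta = 0$ are understood here as the corresponding limits as $\beta \downarrow 0$.)

Proposition 3.8. Let $0 \le \beta < \alpha$ and $x > t$, and let $(S_n)$ be a martingale or, more generally, a submartingale. Then, for any natural $n$,

(3.8)    $\mathsf{E}(S_n - x)_+^{\beta} \le k_{\alpha,\beta}\, \frac{\mathsf{E}(S_n - t)_+^{\alpha}}{(x - t)^{\alpha - \beta}},$

and $k_{\alpha,\beta}$ is the best constant here.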
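The pointwise inequality (3.7) of Proposition 3.7 and its extremal point $u_* = (\alpha x - \beta t)/(\alpha - \beta)$ can be checked numerically. The sketch below is an illustration only; the function names and the sample values $\alpha = 5$, $\beta = 2$, $x = 1$, $t = -0.5$ are our choices, and the case $\beta = 0$ is deliberately avoided because Python evaluates `0.0 ** 0` as 1 whereas the paper sets $0^0 := 0$.

```python
def k(alpha: float, beta: float) -> float:
    """k_{alpha,beta} from (3.6), for 0 < beta < alpha."""
    return beta ** beta * (alpha - beta) ** (alpha - beta) / alpha ** alpha

def u_star(alpha: float, beta: float, x: float, t: float) -> float:
    """The point at which (3.7) turns into an equality."""
    return (alpha * x - beta * t) / (alpha - beta)

def lhs(u: float, x: float, beta: float) -> float:
    """Left-hand side (u - x)_+^beta of (3.7)."""
    return max(u - x, 0.0) ** beta

def rhs(u: float, x: float, t: float, alpha: float, beta: float) -> float:
    """Right-hand side k_{alpha,beta} (u - t)_+^alpha / (x - t)^(alpha - beta) of (3.7)."""
    return k(alpha, beta) * max(u - t, 0.0) ** alpha / (x - t) ** (alpha - beta)

alpha, beta, x, t = 5.0, 2.0, 1.0, -0.5
us = u_star(alpha, beta, x, t)
print(us, lhs(us, x, beta), rhs(us, x, t, alpha, beta))
```

With these sample values the extremal point is $u_* = 2$, where both sides of (3.7) are equal up to rounding.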

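Proposition 3.9. Let $\alpha$ and $\beta$ be as in Theorem 3.2. Then

(3.9)    $k_{1;\alpha,\beta} \le k_{3;\alpha,\beta} := \Bigl( \frac{\alpha}{\alpha - 1} \Bigr)^{\alpha} k_{\alpha,\beta},$

where $k_{\alpha,\beta}$ is defined by (3.6).

Proposition 3.10. Let $\alpha > 1$. Then

(3.10)    $k_1(\alpha, \alpha) = k_3(\alpha, \alpha) = \Bigl( \frac{\alpha}{\alpha - 1} \Bigr)^{\alpha}.$

Corollary 3.11. Let $\alpha$ and $\beta$ be as in Theorem 3.2. Then

(3.11)    $k(\alpha, \beta) \le k_1(\alpha, \beta) \le k_2(\alpha, \beta) \wedge k_3(\alpha, \beta);$

at that,

(3.12)    $k(\alpha, 0) = k_1(\alpha, 0) = k_2(\alpha, 0) = 1,$

while

(3.13)    $k_1(\alpha, \alpha) = k_3(\alpha, \alpha) = \Bigl( \frac{\alpha}{\alpha - 1} \Bigr)^{\alpha} > k(\alpha, \alpha) = 1.$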
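To get a feeling for the ordering (3.11), one can approximate $k_1$ from (3.2) numerically and compare it with the closed-form constants. The sketch below is a rough illustration under our own assumptions: the integral is computed by a composite Simpson rule (adequate for $\beta \ge 1$, where the integrand is bounded), the sup over $\sigma$ is replaced by a maximum over a logarithmic grid, and all function names are ours.

```python
import math

def k(alpha: float, beta: float) -> float:
    """k_{alpha,beta} from (3.6), for 0 < beta < alpha."""
    return beta ** beta * (alpha - beta) ** (alpha - beta) / alpha ** alpha

def k2(alpha: float, beta: float) -> float:
    """k_{2;alpha,beta} from (3.5)."""
    return math.gamma(1.0 + beta) * math.gamma(alpha - beta) / math.gamma(alpha)

def k3(alpha: float, beta: float) -> float:
    """k_{3;alpha,beta} from (3.9)."""
    return (alpha / (alpha - 1.0)) ** alpha * k(alpha, beta)

def inner_integral(sigma: float, beta: float, n: int = 1000) -> float:
    """Composite Simpson approximation of int_0^sigma beta s^(beta-1)/(1+s) ds, beta >= 1."""
    h = sigma / n
    total = 0.0
    for j in range(n + 1):
        s = j * h
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * beta * s ** (beta - 1.0) / (1.0 + s)
    return total * h / 3.0

def k1(alpha: float, beta: float) -> float:
    """Grid approximation of the sup in (3.2) over sigma in [0.01, 100]."""
    best = 0.0
    for j in range(-200, 201):
        sigma = 10.0 ** (j / 100.0)
        val = sigma ** (-beta * (alpha - 1.0)) * inner_integral(sigma, beta) ** alpha
        best = max(best, val)
    return best

a, b = 5.0, 2.0
print(k(a, b), k1(a, b), k2(a, b), k3(a, b))
```

For $\alpha = 5$, $\beta = 2$ the inner integral has the closed form $2(\sigma - \ln(1 + \sigma))$, which can serve as an independent check of `inner_integral`.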


4. Concentration inequalities for separately Lipschitz functions

Definition 4.1. Let us say that a real-valued function $g$ of $n$ (not necessarily real-valued) arguments is separately Lipschitz if it satisfies a Lipschitz type condition in each of its arguments:

(4.1)    $|g(x_1, \dots, x_{i-1}, \tilde x_i, x_{i+1}, \dots, x_n) - g(x_1, \dots, x_n)| \le \rho_i(\tilde x_i, x_i) < \infty$

for all $i$ and all $x_1, \dots, x_n, \tilde x_i$, where $\rho_i(\tilde x_i, x_i)$ depends only on $\tilde x_i$ and $x_i$. Let the radius of the separately Lipschitz function $g$ be defined as

$r := \sqrt{r_1^2 + \dots + r_n^2},$

where

(4.2)    $r_i := \frac12 \sup_{\tilde x_i, x_i} \rho_i(\tilde x_i, x_i).$


The concentration inequalities given in this section follow from the martingale inequalities given in the preceding sections. Their proofs here are based on the improvements given in [?] and [?] of the method of Yurinskii (1974) [32]; cf. [?].

Papers [32], [?], and [?] deal mainly with separately Lipschitz functions $g$ of the form

$g(x_1, \dots, x_n) = \|x_1 + \dots + x_n\|,$

where the $x_i$'s are vectors in a normed space; however, it was already understood there that the methods would work for much more general functions $g$; see [?, Remark 1]. In a similar fashion, various concentration inequalities for general functions $g$ were obtained in [?].

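Theorem 4.2. Suppose that a r.v. $Y$ can be represented as a real-valued function $g$ of independent (not necessarily real-valued) r.v.'s $X_1, \dots, X_n$:

$Y = g(X_1, \dots, X_n),$

where $g$ is separately Lipschitz with radius $r$. Then

(4.3)    $\mathsf{E} f(Y - \mathsf{E} Y) \le \mathsf{E} f(rZ)$ for all $f \in \mathcal{F}_+^{(5)}$ and

(4.4)    $\mathsf{E} f(Y - \mathsf{E} Y) \le c_{5,\beta}\, \mathsf{E} f(rZ)$ for all $\beta \in [0, 5]$ and all $f \in \mathcal{F}_+^{(\beta)}$,

where $Z \sim N(0,1)$. In particular, for all real $x$,

(4.5)    $\mathsf{P}(Y - \mathsf{E} Y \ge x) \le c_{5,0}\, \mathsf{P}(rZ \ge x) = c_{5,0}\, \overline{\Phi}\Bigl( \frac{x}{r} \Bigr).$

Proposition 4.3. Inequalities (4.3), (4.4), and (4.5) will hold if the conditions of Theorem 4.2 are relaxed so that $r_i$ is replaced by

(4.6)    $\hat r_i := \frac12 \sup_{x_1, \dots, x_i, \tilde x_i} \bigl| \mathsf{E}\, g(x_1, \dots, x_{i-1}, \tilde x_i, X_{i+1}, \dots, X_n) - \mathsf{E}\, g(x_1, \dots, x_i, X_{i+1}, \dots, X_n) \bigr|,$

for every $i$. Note that $\hat r_i \le r_i$ for all $i$.

Remark 4.4. The upper bound given by (4.5) can be replaced by the tighter bound

$\min\Bigl( \exp\Bigl( -\frac{x^2}{2r^2} \Bigr),\ c_{5,0}\, \overline{\Phi}\Bigl( \frac{x}{r} \Bigr) \Bigr),$

which is less than $\exp\bigl( -\frac{x^2}{2r^2} \bigr)$ for all $\frac{x}{r} > 1.89$.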
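The threshold $1.89$ in Remark 4.4 is the point where the two bounds cross. A small bisection sketch (with our own helper names, and with the bracketing interval $[1, 3]$ chosen by us) locates the root of $c_{5,0}\,\overline{\Phi}(z) - e^{-z^2/2}$ in $z = x/r$.

```python
import math

C50 = math.factorial(5) * (math.e / 5.0) ** 5  # c_{5,0} = 5!(e/5)^5

def normal_tail(z: float) -> float:
    """P(Z >= z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def gap(z: float) -> float:
    """c_{5,0} P(Z >= z) - exp(-z^2/2); negative where the normal-tail bound is smaller."""
    return C50 * normal_tail(z) - math.exp(-z * z / 2.0)

def crossing(lo: float = 1.0, hi: float = 3.0, tol: float = 1e-10) -> float:
    """Bisection for the root of gap on [lo, hi], assuming gap(lo) > 0 > gap(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(crossing())
```

Beyond the crossing point the normal-tail term is the smaller of the two, so the minimum in Remark 4.4 strictly improves on the exponential bound there.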

The foregoing conditions can be modified as follows.

Theorem 4.5. Suppose that

(4.7)    $\Xi_i(x_1, \dots, x_{i-1}, x_i) := \mathsf{E}\, g(x_1, \dots, x_{i-1}, x_i, X_{i+1}, \dots, X_n) - \mathsf{E}\, g(x_1, \dots, x_{i-1}, X_i, X_{i+1}, \dots, X_n)$

(4.8)    $\le D_{i-1}(x_1, \dots, x_{i-1})$

and

(4.9)    $\frac12 \Bigl( D_{i-1}(x_1, \dots, x_{i-1}) + \frac{\mathsf{E}\, \Xi_i(x_1, \dots, x_{i-1}, X_i)^2}{D_{i-1}(x_1, \dots, x_{i-1})} \Bigr) \le \hat s_i$

for all $i$ and all $x_1, \dots, x_{i-1}, x_i$, where $D_{i-1} > 0$ depends only on $i$ and $x_1, \dots, x_{i-1}$, and $\hat s_i$ depends only on $i$. Let

$\hat s := \sqrt{\hat s_1^2 + \dots + \hat s_n^2}.$

Then inequalities (4.3), (4.4), and (4.5) will hold if $r$ is replaced there by $\hat s$.

The next two propositions show how to obtain good upper bounds on $\Xi_i(x_1, \dots, x_{i-1}, x_i)$ and $\mathsf{E}\, \Xi_i(x_1, \dots, x_{i-1}, X_i)^2$, to be used in Theorem 4.5.

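Proposition 4.6. If $g$ is separately Lipschitz so that (4.1) holds, then for all $i$ and all $x_1, \dots, x_{i-1}$,

(4.10)    $\mathsf{E}\, \Xi_i(x_1, \dots, x_{i-1}, X_i)^2 \le \inf_{\tilde x_i} \mathsf{E}\, \rho_i(X_i, \tilde x_i)^2.$

If, moreover, the function $g$ is convex in each of its arguments, then for all $i$ and all $x_1, \dots, x_i$,

(4.11)    $\Xi_i(x_1, \dots, x_{i-1}, x_i) \le \rho_i(x_i, \mathsf{E} X_i).$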


Remark 4.7. We do not require that $\rho_i$ be a metric. However, the smallest possible $\rho_i$, which is the supremum of the left-hand side of (4.1) over all $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$, is necessarily a metric. Note also that, for $r_i$ defined by (4.2),



for all $x_i$, provided that the following conditions hold: (i) $\rho_i$ is the smallest possible and, moreover, is a norm; (ii) $X_i$ is symmetrically distributed; and (iii) $x_i$ belongs to the support of the distribution of $X_i$.

Corollary 4.8. Let here $X_1, \dots, X_n$ be independent r.v.'s with values in a separable Banach space with norm $\| \cdot \|$, and let

$Y := \|X_1 + \dots + X_n\|.$

Suppose that, with probability 1,

(4.12)    $\|X_i - \mathsf{E} X_i\| \le d_i$


and 


(4.13) 



for all $i$, where $d_i > 0$ and $\hat s_i > 0$ are non-random constants. Let

$\hat s := \sqrt{\hat s_1^2 + \dots + \hat s_n^2}.$

Then inequalities (4.3), (4.4), and (4.5) will hold if $r$ is replaced there by $\hat s$.

5. Proofs

5.1. Proofs for Section 2. Let us first observe that Theorem 2.1 can be easily reduced to the case when $(S_n)$ is a martingale. This is implied by the following two lemmas.

The next lemma is obvious and stated here for the convenience of reference.


Lemma 5.1.1. Let $(S_n)$ be a supermartingale as in Theorem 2.1, so that conditions (2.1) and (2.2) are satisfied. Let

$\tilde X_i := X_i - \mathsf{E}_{i-1} X_i$, $\tilde C_{i-1} := C_{i-1} - \mathsf{E}_{i-1} X_i$, and $\tilde D_{i-1} := D_{i-1} - \mathsf{E}_{i-1} X_i$.

Then $\tilde X_i$ is $H_{\le i}$-measurable, $\tilde C_{i-1}$ and $\tilde D_{i-1}$ are $H_{\le(i-1)}$-measurable, and one has

$X_i \le \tilde X_i$,

$\mathsf{E}_{i-1} \tilde X_i = 0$,

$\tilde C_{i-1} \le \tilde X_i \le \tilde D_{i-1}$, and

$\tilde D_{i-1} - \tilde C_{i-1} \le 2 s_i$

with probability 1.

Proof of Theorem 2.1. This proof is similar to the proof of Theorem 2.1 in [25] but based on the following lemma, in place of Lemma 3.2 in [25]. (Also, one has to refer here to Lemma 5.1.1 instead of Lemma 3.1 in [25].) □

Lemma 5.1.2. Let $X$ be a r.v. such that $\mathsf{E} X = 0$ and $c \le X \le d$ with probability 1 for some real constants $c$ and $d$ (whence $c \le 0$ and $d \ge 0$). Let $Z \sim N(0,1)$. Then for all $f \in \mathcal{F}_+^{(5)}$

(5.1)    $\mathsf{E} f(X) \le \mathsf{E} f\Bigl( \frac{d - c}{2}\, Z \Bigr).$


Proof. This proof is rather long. Let $\mathcal{X}_{c,d}$ be the set of all r.v.'s $X$ such that $\mathsf{E} X = 0$ and $c \le X \le d$ with probability 1. In view of [?] (say), for any given real $t$, a maximum of $\mathsf{E} f_t(X)$ over all r.v.'s $X$ in $\mathcal{X}_{c,d}$ is attained when $X$ takes on only two values, say $a$ and $b$, in the interval $[c, d]$. Since the function $f_t$ is convex, it then follows that, without loss of generality (w.l.o.g.), $a = c$ and $b = d$. Indeed, $\mathsf{E} g(\sigma Z)$ is non-decreasing in $\sigma > 0$ for $Z \sim N(0,1)$ and any convex function $g$. One way to verify the latter statement is as follows. It suffices to consider the functions of the form $g(u) = (u - t)_+$ for real $t$; cf. the corresponding identity in Pinelis (1994). But the derivative of $\mathsf{E}(\sigma Z - t)_+$ in $\sigma > 0$ is $\varphi(t/\sigma) > 0$. Alternatively, one can prove that $\mathsf{E} g(\sigma Z)$ is non-decreasing in $\sigma > 0$ by an application of Jensen's inequality. Moreover, by rescaling, w.l.o.g. $d - c = 2$. In other words, then one has the following:

$X = \begin{cases} 2r & \text{with probability } 1 - r, \\ 2r - 2 & \text{with probability } r, \end{cases}$

for some $r \in [0, 1]$. At that,

$Y := \frac{d - c}{2}\, Z \sim N(0,1).$

Now the right-hand side of inequality (5.1) can be written as

(5.2)    $\mathsf{E} f_t(Y) = R(t) := P(t) \varphi(t) - Q(t) \overline{\Phi}(t),$

where

$P(t) := 8 + 9t^2 + t^4$ and $Q(t) := t(15 + 10t^2 + t^4),$

and its left-hand side as

(5.3)    $\mathsf{E} f_t(X) = L(r, t) := r(2r - 2 - t)_+^5 + (1 - r)(2r - t)_+^5,$

so that (5.1) is reduced to the inequality

(5.4)    $L(r, t) \le R(t)$

for all $r \in [0, 1]$ and all real $t$.

Note that (5.4) is trivial for $t \ge 2r$, because then $L(r, t) = 0$.
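Both the closed form (5.2) and the target inequality (5.4) lend themselves to a quick numerical sanity check, which of course does not replace the proof. The sketch below compares $R(t)$ with a Simpson-rule approximation of $\int_t^{\infty} (z - t)^5 \varphi(z)\, dz$ and evaluates $L(r, t)$ as in (5.3). The function names, the truncation of the integral at $\max(t, 0) + 12$, and the grid used in any check are our own choices.

```python
import math

def phi(t: float) -> float:
    """Standard normal density."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def tail(t: float) -> float:
    """Standard normal tail P(Z >= t)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def R(t: float) -> float:
    """Closed form (5.2): P(t) phi(t) - Q(t) tail(t)."""
    P = 8.0 + 9.0 * t ** 2 + t ** 4
    Q = t * (15.0 + 10.0 * t ** 2 + t ** 4)
    return P * phi(t) - Q * tail(t)

def R_quadrature(t: float, upper: float = 12.0, n: int = 20000) -> float:
    """Composite Simpson approximation of E (Z - t)_+^5 (integral truncated; n is even)."""
    a, b = t, max(t, 0.0) + upper
    h = (b - a) / n
    total = 0.0
    for j in range(n + 1):
        z = a + j * h
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * (z - t) ** 5 * phi(z)
    return total * h / 3.0

def L(r: float, t: float) -> float:
    """Left-hand side (5.3) for the two-point r.v. X with d - c = 2."""
    return r * max(2 * r - 2 - t, 0.0) ** 5 + (1 - r) * max(2 * r - t, 0.0) ** 5

print(R(0.0), R_quadrature(0.0))   # both close to 8 * phi(0)
print(L(0.5, 0.0), R(0.0))
```

For large positive $t$ the two terms in `R` nearly cancel, so the closed form is best evaluated for moderate $t$ in floating-point arithmetic.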

Therefore, it remains to consider two cases: $(r, t) \in B$ and $(r, t) \in C$, where

$B := \{(r, t)\colon 0 \le r \le 1,\ t \le 2r - 2\}$ and

$C := \{(r, t)\colon 0 \le r \le 1,\ 2r - 2 \le t \le 2r\}.$

Case 1: $(r, t) \in B$. Note that in this case $t \le 0$ and, by (5.3),

$L(r, t) = r(2r - 2 - t)^5 + (1 - r)(2r - t)^5.$

For $t \ne 0$, one has the identity

(5.5)    $\frac{Q(t)^2}{\varphi(t)}\, \partial_t \frac{R(t) - L(r, t)}{Q(t)} = Q_2(r, t) := \frac{Q_1(r, t)}{\varphi(t)} - 120,$

where

$Q_1(r, t) := Q'(t) L(r, t) - Q(t)\, \partial_t L(r, t),$

which is a polynomial in $r$ and $t$. Note that

$\partial_r Q_2(r, t) = \frac{\partial_r Q_1(r, t)}{\varphi(t)}$ and $\partial_t Q_2(r, t) = \frac{20\, Q(t)}{\varphi(t)}\, d(r, t),$

where

$d(r, t) := \frac{\partial_t Q_1(r, t) + t\, Q_1(r, t)}{20\, Q(t)}$

is a polynomial in $r$ and $t$, of degree 2 in $r$. Therefore, the critical points of $Q_2$ in the interior $\operatorname{int} B$ of domain $B$ are the solutions $(r, t)$ of the system of polynomial equations

$d(r, t) = 0,$

$\partial_r Q_1(r, t) = 0.$
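The constant $-120$ in (5.5) comes from a purely polynomial identity: since $\varphi' = -t\varphi$ and $\overline{\Phi}' = -\varphi$, one has $R' = (P' - tP + Q)\varphi - Q'\overline{\Phi}$, so that in $R'Q - RQ'$ the terms with $\overline{\Phi}$ cancel and what remains is $\bigl((P' - tP + Q)Q - PQ'\bigr)\varphi$. The sketch below checks, with elementary coefficient-list arithmetic (helper names are ours), that this polynomial factor is the constant $-120$.

```python
def pmul(a, b):
    """Product of two polynomials given as coefficient lists (index = degree)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def padd(a, b):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pneg(a):
    """Negation of a polynomial."""
    return [-x for x in a]

def pder(a):
    """Derivative of a polynomial."""
    return [i * a[i] for i in range(1, len(a))] or [0]

def trim(a):
    """Drop trailing zero coefficients."""
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a

P = [8, 0, 9, 0, 1]          # P(t) = 8 + 9 t^2 + t^4
Q = [0, 15, 0, 10, 0, 1]     # Q(t) = t (15 + 10 t^2 + t^4)
T = [0, 1]                   # the polynomial t

# Coefficient of phi in R': P' - t P + Q.
A = padd(padd(pder(P), pneg(pmul(T, P))), Q)
# (R'Q - RQ') / phi = A Q - P Q'.
numerator = trim(padd(pmul(A, Q), pneg(pmul(P, pder(Q)))))
print(numerator)  # [-120]
```

Subtracting the contribution $\partial_t L \cdot Q - L\, Q' = -Q_1$ of $L$ then gives exactly the right-hand side of (5.5) after division by $\varphi(t)$.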

Further, one has

$d(r, t) = 0$ if and only if $r = r_1(u)$ or $r = r_2(u)$,

where

$u := 2r - 2 - t > 0$, $r_1(u) := \frac{1 + u/2}{1 + u} \in (0, 1)$, and $r_2(u) := \frac{2 + 2u + u^2/2}{2 + 2u + u^2} \in (0, 1)$.

Using the Sturm theorem or the convenient command Reduce of Mathematica 5.0, one can see that the only solution $u = u_1 > 0$ of the algebraic equation $\partial_r Q_1(r, t)|_{r = r_1(u),\ t = 2r_1(u) - 2 - u} = 0$ is $0.269\ldots$, and $Q_2(r, t)|_{r = r_1(u_1),\ t = 2r_1(u_1) - 2 - u_1} < 0$. As for the equation $\partial_r Q_1(r, t)|_{r = r_2(u),\ t = 2r_2(u) - 2 - u} = 0$, it has no solutions $u > 0$.

Thus, $Q_2 < 0$ at the only critical point $(r, t) = (r_1(u_1), 2r_1(u_1) - 2 - u_1)$ of $Q_2$ in $\operatorname{int} B$.

Next, with $u \ge 0$,

$Q_2(r, t)|_{r = 0,\ t = 2r - 2 - u} = -20 \Bigl( 6 + \frac{(2 + u)^5 (3 + (2 + u)^2)}{\varphi(2 + u)} \Bigr) < 0.$

Similarly, with $u \ge 0$,

$Q_2(r, t)|_{r = 1,\ t = 2r - 2 - u} = -20 \Bigl( 6 + \frac{u^5 (3 + u^2)}{\varphi(u)} \Bigr) < 0.$

Now consider the function

$q_2(r) := Q_2(r, t)|_{t = 2r - 2}.$

Then $\varphi(2r - 2)\, q_2'(r)$ is a polynomial, whose only root $r = r_3 \in (0, 1)$ is $0.865\ldots$. But $q_2(r_3) < 0$. Therefore, $Q_2 < 0$ at the only critical point of $Q_2$ in the relative interior of the boundary $t = 2r - 2$ of domain $B$.

Thus, as far as the sign of $Q_2$ on $B$ is concerned, it remains to consider the behavior of $Q_2$ as $t \to -\infty$, which is as follows: $Q_2(r, t) \sim 20(2r - 1)^2 t^7 / \varphi(t) \to -\infty < 0$ for every $r \ne 1/2$ and $Q_2(r, t) \sim 40 t^3 (5 + t^2) / \varphi(t) \to -\infty < 0$ for $r = 1/2$.

(As usual, $a \sim b$ means $a/b \to 1$.)

We conclude that $Q_2 < 0$ on $B$. Hence, in view of (5.5), the ratio $\frac{R(t) - L(r, t)}{Q(t)}$ is decreasing in $t$ on $B$.

Next, note that $\varphi(t)$ and $1 - \overline{\Phi}(t)$ are $o(1/|t|^p)$ for every $p > 0$ as $t \to -\infty$. Hence, in view of (5.2), one has the following as $t \to -\infty$: $R(t) - L(r, t) = -Q(t) - L(r, t) + o(1) \sim -10(2r - 1)^2 t^3 \to \infty$ for every $r \ne 1/2$ and $R(t) - L(r, t) = -10t + o(1) \to \infty$ for $r = 1/2$.

Hence, $\frac{R(t) - L(r, t)}{Q(t)} < 0$ for each $r \in [0, 1]$ and all $t < 0$ with large enough $|t|$. Since $\frac{R(t) - L(r, t)}{Q(t)}$ is decreasing in $t$ on $B$, one has $\frac{R(t) - L(r, t)}{Q(t)} < 0$ on $B$, whence $L(r, t) \le R(t)$ on $B$ (because $Q(t) < 0$ on $B$).

It remains to consider

Case 2: $(r, t) \in C$. Here, letting $v := 2r - t$, one has $0 \le v \le 2$, and, by (5.3),

$L(r, t) = (1 - r)(2r - t)^5.$

Let us use here notation introduced in the above consideration of Case 1. Then

$d(r, t)|_{t = 2r - v} = -(1 - r)\, v^3 \Bigl( 1 - \frac{rv}{2} \Bigr) < 0$

for $(r, t) = (r, 2r - v) \in \operatorname{int} C$. This implies that $Q_2$ has no critical points in $\operatorname{int} C$.

Next, with $v \ge 0$,

$Q_2(r, t)|_{r = 0,\ t = 2r - v} = -20 \Bigl( 6 + \frac{v^5 (3 + v^2)}{\varphi(v)} \Bigr) < 0.$

On the boundaries $r = 1$ and $t = 2r$ of $C$, one has $Q_2 = -120 < 0$. The boundary $t = 2r - 2$ of $C$ is common with $B$, and it was shown above that $Q_2 < 0$ on that boundary as well.

Thus, $Q_2 < 0$ on $C$. Since $Q(t) = 0$ only for $t = 0$, it follows that the ratio $\frac{R(t) - L(r, t)}{Q(t)}$ is decreasing in $t$ on $C$.

Hence, just as on $B$, one has that $L(r, t) \le R(t)$ on $C_- := \{(r, t) \in C\colon t < 0\}$.

Moreover, $\frac{R(t) - L(r, t)}{Q(t)} = \frac{R(t)}{Q(t)} > 0$ for $t = 2r$, since $Q > 0$ on $C_+ := \{(r, t) \in C\colon t > 0\}$. Because $\frac{R(t) - L(r, t)}{Q(t)}$ is decreasing in $t$, one has $\frac{R(t) - L(r, t)}{Q(t)} \ge 0$ on $C_+$ and hence $L(r, t) \le R(t)$ on $C_+$. □


Proof of Theorem 2.3. This proof is similar to the proof of Theorem 2.1 in [25] and Theorem 2.1, but based on the following lemma, instead of Lemma 3.2 in [25] or Lemma 5.1.2. (As in the proof of Theorem 2.1, here one has also to refer to Lemma 3.1 in [25], rather than Lemma 5.1.1.) □

Lemma 5.1.3. Suppose that $X$ is a r.v. such that $\mathsf{E} X = 0$, $X \le d$ with probability 1, and $\mathsf{E} X^2 \le \sigma^2$, for some positive constants $d$ and $\sigma$. Let

$\hat s := \frac12 \Bigl( d + \frac{\sigma^2}{d} \Bigr).$

Let $Z \sim N(0,1)$. Then for all $f \in \mathcal{F}_+^{(5)}$

(5.6)    $\mathsf{E} f(X) \le \mathsf{E} f(\hat s Z).$

Proof. In view of (1.2), one has $\mathcal{F}_+^{(5)} \subseteq \mathcal{F}_+^{(2)}$. Therefore, by Lemma 3.2 in [25], one may assume without loss of generality that here $X = d \cdot X_a$, where $a = \sigma^2/d^2$. Now it is seen that Lemma 5.1.3 follows from Lemma 5.1.2. □

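5.2. Proofs for Section 3.

Proof of Theorem 3.1. Lemma 5.1.1 and Lemma 3.1 in [25] reduce Theorem 3.1 to the case when $(S_n)$ is a martingale, and then Theorem 3.1 follows by Doob's inequality (3.3). □

Proof of Theorem 3.2. For every $y > t$, by Doob's inequality,

$\mathsf{P}(M_n \ge y) \le \frac{\mathsf{E}(S_n - t)_+\, I\{M_n \ge y\}}{y - t}.$

Hence, letting

(5.7)    $J(u) := \int_x^u \frac{\beta (y - x)^{\beta - 1}}{y - t}\, dy \; I\{u > x\}$ and $\alpha' := \frac{\alpha}{\alpha - 1},$

and using Fubini's theorem, one has

$\mathsf{E}(M_n - x)_+^{\beta} = \int_x^{\infty} \beta (y - x)^{\beta - 1}\, \mathsf{P}(M_n \ge y)\, dy$

$\le \int_x^{\infty} \beta (y - x)^{\beta - 1}\, \frac{\mathsf{E}(S_n - t)_+\, I\{M_n \ge y\}}{y - t}\, dy$

$= \mathsf{E}(S_n - t)_+\, J(M_n)$

(5.8)    $\le \bigl( \mathsf{E}(S_n - t)_+^{\alpha} \bigr)^{1/\alpha} \bigl( \mathsf{E} J(M_n)^{\alpha'} \bigr)^{1/\alpha'},$

by Hölder's inequality.

Observe that for all real $u$

(5.9)    $J(u) \le c^{1/\alpha} (u - x)_+^{\beta/\alpha'}$, where $c := \frac{k_{1;\alpha,\beta}}{(x - t)^{\alpha - \beta}}.$

Indeed, introducing new variables $\sigma := \frac{u - x}{x - t}$ and $s := \frac{y - x}{x - t}$, one can see that, for $u > x$,

$J(u) = (x - t)^{\beta - 1} \int_0^{\sigma} \frac{\beta s^{\beta - 1}\, ds}{1 + s}$

and

$c^{1/\alpha} (u - x)_+^{\beta/\alpha'} = k_{1;\alpha,\beta}^{1/\alpha}\, \sigma^{\beta/\alpha'}\, (x - t)^{\beta - 1},$

so that (5.9) follows, in view of (3.2). Now (5.8) and (5.9) imply (3.1), since (5.9) yields $\mathsf{E} J(M_n)^{\alpha'} \le c^{\alpha'/\alpha}\, \mathsf{E}(M_n - x)_+^{\beta}$. □

Proof of Theorem 3.3. This is similar to the proof of Theorem 3.1 but relies on inequality (3.1) in place of Doob's inequality (3.3). □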


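Proof of Proposition 3.5. Introduce

$f(\sigma, \alpha, \beta, \gamma) := \sigma^{\beta(1 - \alpha/\gamma)} \Bigl( \int_0^{\sigma} \frac{\beta s^{\beta - 1}\, ds}{(1 + s)^{\gamma}} \Bigr)^{\alpha/\gamma}$ and

$K(\alpha, \beta, \gamma) := \sup_{\sigma > 0} f(\sigma, \alpha, \beta, \gamma).$

Then $\sigma^{-\beta/\alpha} f(\sigma, \alpha, \beta, \gamma)^{1/\alpha} = (\mathsf{E} Y^{\gamma})^{1/\gamma}$, where $Y := \frac{1}{1 + S}$ and $S$ is a r.v. with density $s \mapsto \sigma^{-\beta} \beta s^{\beta - 1} I\{0 < s < \sigma\}$. Hence, $f(\sigma, \alpha, \beta, \gamma)$ is non-decreasing in $\gamma$, and then so is $K(\alpha, \beta, \gamma)$. Therefore,

$k_{1;\alpha,\beta} = K(\alpha, \beta, 1) \le K(\alpha, \beta, \alpha) = k_{2;\alpha,\beta}.$

□

Proof of Proposition 3.9. By (3.2),

(5.10)    $k_{1;\alpha,\beta} = \sup_{\sigma > 0} r(\sigma)^{\alpha},$

where

$r(\sigma) := \frac{f(\sigma)}{g(\sigma)}$, $f(\sigma) := \int_0^{\sigma} \frac{\beta s^{\beta - 1}\, ds}{1 + s}$, and $g(\sigma) := \sigma^{\beta/\alpha'}.$

Note that the monotonicity pattern of

(5.11)    $r_1(\sigma) := \frac{f'(\sigma)}{g'(\sigma)} = \alpha'\, \frac{\sigma^{\beta/\alpha}}{1 + \sigma}$

on $(0, \infty)$ is $\nearrow\searrow$; that is, there exists some $\sigma_1(\alpha, \beta) \in (0, \infty)$ such that $r_1 \nearrow$ (is increasing) on $(0, \sigma_1(\alpha, \beta))$ and $r_1 \searrow$ (is decreasing) on $(\sigma_1(\alpha, \beta), \infty)$; namely, here

(5.12)    $\sigma_1(\alpha, \beta) = \frac{\beta}{\alpha - \beta}.$

Also, $g g' > 0$ on $(0, \infty)$. Hence, it follows from [?, Proposition 1.9] that $r$ has one of these monotonicity patterns on $(0, \infty)$: $\nearrow$ or $\searrow$ or $\nearrow\searrow$ or $\searrow\nearrow$ or $\searrow\nearrow\searrow$. However, $r(\sigma)$ is positive on $(0, \infty)$ and converges to 0 when $\sigma \downarrow 0$ as well as when $\sigma \to \infty$. This leaves only one possible pattern for $r$: $\nearrow\searrow$. Hence, there is some $\sigma(\alpha, \beta) \in (0, \infty)$, at which $r$ attains its maximum on $(0, \infty)$; moreover, $r'(\sigma(\alpha, \beta)) = 0$, which is equivalent to $r(\sigma(\alpha, \beta)) = r_1(\sigma(\alpha, \beta))$. Thus,

$k_{1;\alpha,\beta} = \sup_{\sigma > 0} r(\sigma)^{\alpha} = r(\sigma(\alpha, \beta))^{\alpha} = r_1(\sigma(\alpha, \beta))^{\alpha} \le \sup_{\sigma > 0} r_1(\sigma)^{\alpha} = r_1(\sigma_1(\alpha, \beta))^{\alpha} = k_{3;\alpha,\beta},$

in view of (5.11), (5.12), and the definition (3.9) of $k_{3;\alpha,\beta}$. □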


Proof of Proposition 3.10. In the case $\beta = \alpha > 1$, the function $r_1$ given by (5.11) is increasing on $(0, \infty)$ to $r_1(\infty) = \alpha' = \frac{\alpha}{\alpha - 1}$. Hence, so is $r$, according to the mentioned [?, Proposition 1.9]. Now Proposition 3.10 follows in view of (5.10). □

Proof of Proposition 3.7. Elementary calculus; the optimal value of $u$, when inequality (3.7) turns into an equality, is

(5.13)    $u_* := \frac{\alpha x - \beta t}{\alpha - \beta} > x.$

□


Proof of Proposition 3.8. Only that $k_{\alpha,\beta}$ is the best constant factor needs to be proved. Without loss of generality, $x > 0$. Suppose that (3.8) holds with some constant $k$ in place of $k_{\alpha,\beta}$; then, by continuity, it holds for the continuous-time martingale $S_v := B_{v \wedge \tau}$ in place of $S_n$, where $B(\cdot)$ is a standard Brownian motion, $v \ge 0$, and

$\tau := \inf\{v \ge 0\colon B_v = u_* \text{ or } B_v = \tilde t\};$

here, $u_*$ is defined by (5.13) and $\tilde t := (-1) \wedge t$. Note that $\mathsf{E} \tau = u_* |\tilde t|$ and $p := \mathsf{P}(B_{\tau} = u_*) = \frac{|\tilde t|}{u_* + |\tilde t|} > 0$. It follows that

$p\, (u_* - x)^{\beta} = \mathsf{E}(S_{\infty} - x)_+^{\beta} \le k\, \frac{\mathsf{E}(S_{\infty} - t)_+^{\alpha}}{(x - t)^{\alpha - \beta}} = k\, p\, \frac{(u_* - t)^{\alpha}}{(x - t)^{\alpha - \beta}}.$

Because $k_{\alpha,\beta}$ is the best constant in (3.7), it follows now that $k \ge k_{\alpha,\beta}$. □


5.3. Proofs for Section 4. The proofs here are based on the improvements given in [?] and [?] of the method of Yurinskii (1974) [32]; cf. [?].

For a r.v. $Y$ as in Theorem 4.2, consider the martingale expansion

$Y - \mathsf{E} Y = \xi_1 + \dots + \xi_n$

of $Y - \mathsf{E} Y$ with the martingale-differences

(5.14)    $\xi_i := \mathsf{E}_i Y - \mathsf{E}_{i-1} Y,$

where $\mathsf{E}_i$ denotes the conditional expectation given $H_{\le i} := (X_1, \dots, X_i)$.

For each $i$ pick an arbitrary non-random $\tilde x_i$, and introduce the r.v.

(5.15)    $\eta_i := Y - \tilde Y_i$, where $\tilde Y_i := g(X_1, \dots, X_{i-1}, \tilde x_i, X_{i+1}, \dots, X_n)$.

Proof of Theorem 4.2 and Proposition 4.3. Note that, for the function $\Xi_i$ defined by (4.7), one has $\Xi_i(X_1, \dots, X_i) = \xi_i$, where $\xi_i$ is defined by (5.14). It follows that

(5.16)    $C_{2,i-1} \le \xi_i \le D_{2,i-1}$ and $D_{2,i-1} - C_{2,i-1} \le 2 \hat r_i \le 2 r_i,$

where $\hat r_i$ and $r_i$ are given by (4.6) and (4.2), and

$C_{2,i-1} := \inf_{\tilde x_i} \mathsf{E}_{i-1}(-\eta_i) = \inf_{\tilde x_i} \mathsf{E}_{i-1} \tilde Y_i - \mathsf{E}_{i-1} Y$ and

$D_{2,i-1} := \sup_{\tilde x_i} \mathsf{E}_{i-1}(-\eta_i) = \sup_{\tilde x_i} \mathsf{E}_{i-1} \tilde Y_i - \mathsf{E}_{i-1} Y$

are $H_{\le(i-1)}$-measurable. Now Proposition 4.3, and hence Theorem 4.2, follow by Theorem 2.1 and Corollary 2.2. □

Proof of Theorem 4.5. This proof is similar to that of Theorem 4.2 and Proposition 4.3 but based on Theorem 2.3 in place of Theorem 2.1 and Corollary 2.2. (Note that $\mathsf{E}\, \Xi_i(x_1, \dots, x_{i-1}, X_i)^2$ is the same as the conditional expectation $\mathsf{E}_{i-1} \xi_i^2$ given that $X_1 = x_1, \dots, X_{i-1} = x_{i-1}$.) □

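Proof of Proposition 4.6. For each $i$,

(5.17)    $\xi_i = \mathsf{E}_i \eta_i - \mathsf{E}_{i-1} \eta_i,$

because $\mathsf{E}_i \tilde Y_i = \mathsf{E}_{i-1} \tilde Y_i$, in view of the independence of the $X_i$'s. Hence and by (4.1), for any given $\tilde x_i$,

(5.18)    $|\eta_i| \le \rho_i(X_i, \tilde x_i)$

with probability 1. It follows from (5.17) and (5.18) that, for any $\tilde x_i$,

$\mathsf{E}_{i-1} \xi_i^2 = \mathsf{E}_{i-1}(\mathsf{E}_i \eta_i - \mathsf{E}_{i-1} \eta_i)^2 = \mathrm{Var}_{i-1}(\mathsf{E}_i \eta_i) \le \mathsf{E}_{i-1}(\mathsf{E}_i \eta_i)^2 \le \mathsf{E}_{i-1} \mathsf{E}_i \eta_i^2$

$= \mathsf{E}_{i-1} \eta_i^2 \le \mathsf{E}_{i-1}\, \rho_i(X_i, \tilde x_i)^2 = \mathsf{E}\, \rho_i(X_i, \tilde x_i)^2,$

which proves (4.10); here, $\mathrm{Var}_{i-1}$ denotes the conditional variance given $H_{\le(i-1)}$.

To prove (4.11), suppose in addition that the function $g$ is convex in each of its arguments, as stated in the second part of Proposition 4.6. Let $\tilde{\mathsf{E}}_i$ denote the conditional expectation given $X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_n$. Then, for all $i$, by Jensen's inequality,

$\mathsf{E}_{i-1} Y = \mathsf{E}_{i-1} \tilde{\mathsf{E}}_i Y = \mathsf{E}_{i-1} \tilde{\mathsf{E}}_i\, g(X_1, \dots, X_n)$

$\ge \mathsf{E}_{i-1}\, g(X_1, \dots, X_{i-1}, \tilde{\mathsf{E}}_i X_i, X_{i+1}, \dots, X_n)$

$= \mathsf{E}_{i-1}\, g(X_1, \dots, X_{i-1}, \mathsf{E} X_i, X_{i+1}, \dots, X_n) = \mathsf{E}_{i-1} \tilde Y_i,$

in view of (5.15), if $\tilde x_i$ is chosen to coincide with $\mathsf{E} X_i$; hence,

$\mathsf{E}_{i-1} \eta_i = \mathsf{E}_{i-1} Y - \mathsf{E}_{i-1} \tilde Y_i \ge 0.$

This and formulas (5.17) and (5.18) imply that

$\xi_i \le \mathsf{E}_i \eta_i \le \rho_i(X_i, \mathsf{E} X_i),$

which is equivalent to (4.11). □

Proof of Corollary 4.8. This follows immediately from Theorem 4.5 and Proposition 4.6 with $\rho_i(\tilde x_i, x_i) = \|\tilde x_i - x_i\|$. □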